Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Fix MAU reaping where reserved users are specified. #6168

Merged
merged 7 commits into from
Oct 11, 2019

Conversation

neilisfragile
Copy link
Contributor

The bug I am fixing is quite hard to explain, but in short, in some cases the reaping background task would delete too many users from the monthly_active_user table if reserved users had been specified.

@neilisfragile neilisfragile changed the title Fix mua reaping where reserved are specified. Fix mau reaping where reserved are specified. Oct 4, 2019
@neilisfragile neilisfragile requested a review from a team October 4, 2019 11:30
for i in range(initial_users):
user = "@user%d:server" % i
email = "user%[email protected]" % i
self.store.upsert_monthly_active_user(user)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.store.upsert_monthly_active_user(user)
self.get_success(self.store.upsert_monthly_active_user(user))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

etc for other store functions

synapse/storage/monthly_active_users.py Outdated Show resolved Hide resolved
query_args = []
query_args.extend(self.reserved_users)
query_args.append(safe_guard)
query_args.extend(self.reserved_users)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can write this as query_args = [*self.reserved_users, safe_guard, *self.reserved_users]


# Update reserved_users to ensure it stays in sync, this is important
# for reaping.
self.reserved_users = users
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels wrong, as there's nothing to say that get_registered_reserved_users_count is going to get called to make this accurate. Can't we query the database to get the reserved_users when we need it? (And potentially cache it?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, opted not to cache because method is called every hour and ought to be light.

else:
safe_guard = max_mau_value - len(self.reserved_users)
# Must be greater than zero for postgres
safe_guard = safe_guard if safe_guard > 0 else 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not just safe_guard = max(max_mau_value - len(self.reserved_users), 0). Also can we rename it to something like num_users_to_remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have renamed, but the sense is the opposite, because it is number of non reserved users to save.

@richvdh richvdh changed the title Fix mau reaping where reserved are specified. Fix MAU reaping where reserved users are specified. Oct 8, 2019
Copy link
Member

@erikjohnston erikjohnston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of formatting nits, otherwise good to go

",".join(questionmarks)
)
query_args.extend(reserved_users)
sql = base_sql + """ AND user_id NOT IN ({})""".format(question_marks)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sql = base_sql + """ AND user_id NOT IN ({})""".format(question_marks)
sql = base_sql + " AND user_id NOT IN ({})".format(question_marks)

ORDER BY timestamp DESC
LIMIT ?
)
AND user_id NOT IN ({})""".format(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
AND user_id NOT IN ({})""".format(
AND user_id NOT IN ({})
""".format(

user_id = yield self.hs.get_datastore().get_user_id_by_threepid(
tp["medium"], tp["address"]
)
if user_id:
count = count + 1
return count
users = users + (user_id,)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd probably be easier to return a list rather than using tuples, really.

@neilisfragile neilisfragile merged commit a0d0ba7 into develop Oct 11, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants